Bayesian clustering for row effects models
Authors
Abstract
We deal with two-way contingency tables having ordered column categories. We use a row effects model wherein each interaction term is assumed to have a multiplicative form involving a row effect parameter and a fixed column score. We propose a methodology to cluster row effects in order to simplify the interaction structure and enhance the interpretation of the model. Our method uses a product partition model with a suitable specification of the cohesion function, so that we can carry out our analysis on a collection of models of varying dimensions using a straightforward MCMC sampler. The methodology is illustrated with reference to simulated and real data sets.
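As a rough guide to the notation behind this abstract, the block below writes out the textbook form of a row effects model and of a product partition prior. The symbols ($m_{ij}$, $\mu_i$, $v_j$, $\rho$, $S_k$, $c(\cdot)$) are assumptions chosen for illustration; the abstract gives no formulas, and the paper's own parameterization and cohesion function may differ.

```latex
% Assumed notation: m_{ij} is the expected count in row i, column j
% of an I x J table; v_1 < ... < v_J are fixed column scores.
\begin{align*}
  % Row effects model: the interaction term is the product of a
  % row effect parameter mu_i and a fixed column score v_j.
  \log m_{ij} &= \lambda + \lambda_i^{X} + \lambda_j^{Y} + \mu_i v_j,
      \qquad i = 1,\dots,I,\quad j = 1,\dots,J, \\
  % Clustering: rows in the same block S_k of a partition
  % rho = \{S_1,\dots,S_K\} of \{1,\dots,I\} share one row effect.
  \mu_i &= \mu^{*}_{k} \quad \text{for all } i \in S_k, \\
  % Product partition prior: the prior on rho factorizes over blocks
  % through a cohesion function c(.).
  p(\rho) &\propto \prod_{k=1}^{K} c(S_k).
\end{align*}
```

Under this reading, each partition $\rho$ corresponds to a model with $K$ distinct row effects, which is why the analysis ranges over models of varying dimension.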
Similar resources
Nonparametric Bayesian Methods for Relational Clustering
An important task in data mining is to identify natural clusters in data. Relational clustering [1], also known as co-clustering for dyadic data, uses information about related objects to help identify the cluster to which an object belongs. For example, words can be used to help cluster documents in which the words occur; conversely, documents can be used to help cluster the words occurring in...
Latent Dirichlet Bayesian Co-Clustering
Co-clustering has emerged as an important technique for mining contingency data matrices. However, almost all existing co-clustering algorithms perform hard partitioning, assigning each row and column of the data matrix to one cluster. Recently a Bayesian co-clustering approach has been proposed which allows probabilistic membership in row and column clusters. The approach uses variatio...
PAC-Bayesian Analysis of Co-clustering and Beyond
We derive PAC-Bayesian generalization bounds for supervised and unsupervised learning models based on clustering, such as co-clustering, matrix tri-factorization, graphical models, graph clustering, and pairwise clustering. We begin with the analysis of co-clustering, which is a widely used approach to the analysis of data matrices. We distinguish between two tasks in matrix data analysis: discr...
Improved Bayesian Training for Context-Dependent Modeling in Continuous Persian Speech Recognition
Context-dependent modeling is a widely used technique for better phone modeling in continuous speech recognition. While different types of context-dependent models have been used, triphones have been known as the most effective ones. In this paper, a Maximum a Posteriori (MAP) estimation approach has been used to estimate the parameters of the untied triphone model set used in data-driven clust...
Integrated Classification Likelihood for Model selection in Block Clustering
Block clustering (or co-clustering or simultaneous clustering) aims at simultaneously partitioning the rows and columns of a data table to reveal homogeneous block structures. This structure can stem from the latent block model which provides a probabilistic modelling of data tables whose blocks arise from row and column clusters. For continuous data, each table entry is typically assumed to fo...
Journal title:
Volume, issue:
Pages: -
Publication date: 2006